Under the guidance of Prof. Haewoon Kwak
Team Members:
Dataset Details: The dataset contains the following columns:
For this project, we will be using a subset of the WASH dataset, which contains data from the years 2019, 2020, 2021, and 2022. This dataset has been merged with another dataset that includes GDP information for each country in the WASH dataset, enabling a comparative study against GDP. Below is the process followed to prepare the final dataset for data visualization.
!pip install jupyter-dash
import pandas as pd
df_gdp = pd.read_excel('reshaped_country_timeseries.xlsx')
df_gdp.head()
| | Country Name | Code | Indicator Name | Year | GDP per capita (current US$) |
|---|---|---|---|---|---|
| 0 | Afghanistan | AFG | GDP per capita (current US$) | 2019 | 500.522981 |
| 1 | Afghanistan | AFG | GDP per capita (current US$) | 2020 | 516.866797 |
| 2 | Afghanistan | AFG | GDP per capita (current US$) | 2021 | 363.674087 |
| 3 | Afghanistan | AFG | GDP per capita (current US$) | 2022 | 353.000000 |
| 4 | Albania | ALB | GDP per capita (current US$) | 2019 | 5396.214227 |
df_final = pd.read_excel('final.xlsx')
df_final.head()
| | ISO3 | Country | Residence / Facility Type | Service Type | Year | Coverage | Population | Service level |
|---|---|---|---|---|---|---|---|---|
| 0 | AFG | Afghanistan | total | Environmental cleaning | 2019 | 84.00000 | 3.172638e+07 | Basic service |
| 1 | AFG | Afghanistan | hospital | Environmental cleaning | 2019 | 79.11322 | 2.988067e+07 | Basic service |
| 2 | AFG | Afghanistan | non_hospital | Environmental cleaning | 2019 | 81.84787 | 3.091353e+07 | Basic service |
| 3 | AFG | Afghanistan | hospital | Hygiene | 2019 | 28.72340 | 1.084868e+07 | Basic service |
| 4 | AFG | Afghanistan | total | Sanitation | 2019 | 2.50000 | 9.442375e+05 | Basic service |
df_gdp.rename(columns={'Code':'ISO3'} , inplace=True)
df_gdp.head()
| | Country Name | ISO3 | Indicator Name | Year | GDP per capita (current US$) |
|---|---|---|---|---|---|
| 0 | Afghanistan | AFG | GDP per capita (current US$) | 2019 | 500.522981 |
| 1 | Afghanistan | AFG | GDP per capita (current US$) | 2020 | 516.866797 |
| 2 | Afghanistan | AFG | GDP per capita (current US$) | 2021 | 363.674087 |
| 3 | Afghanistan | AFG | GDP per capita (current US$) | 2022 | 353.000000 |
| 4 | Albania | ALB | GDP per capita (current US$) | 2019 | 5396.214227 |
final_dataset = pd.merge(df_final,df_gdp,on=['ISO3', 'Year'], how='inner')
final_dataset.head()
| | ISO3 | Country | Residence / Facility Type | Service Type | Year | Coverage | Population | Service level | Country Name | Indicator Name | GDP per capita (current US$) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AFG | Afghanistan | total | Environmental cleaning | 2019 | 84.00000 | 3.172638e+07 | Basic service | Afghanistan | GDP per capita (current US$) | 500.522981 |
| 1 | AFG | Afghanistan | hospital | Environmental cleaning | 2019 | 79.11322 | 2.988067e+07 | Basic service | Afghanistan | GDP per capita (current US$) | 500.522981 |
| 2 | AFG | Afghanistan | non_hospital | Environmental cleaning | 2019 | 81.84787 | 3.091353e+07 | Basic service | Afghanistan | GDP per capita (current US$) | 500.522981 |
| 3 | AFG | Afghanistan | hospital | Hygiene | 2019 | 28.72340 | 1.084868e+07 | Basic service | Afghanistan | GDP per capita (current US$) | 500.522981 |
| 4 | AFG | Afghanistan | total | Sanitation | 2019 | 2.50000 | 9.442375e+05 | Basic service | Afghanistan | GDP per capita (current US$) | 500.522981 |
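Because the merge above uses `how='inner'`, any (ISO3, Year) pair that appears in only one of the two tables is dropped without notice. The following is a minimal sketch of how that loss could be audited with the `indicator` and `validate` options of `pd.merge`. The frames `wash` and `gdp` are hypothetical stand-ins with made-up values, not the project data.

```python
import pandas as pd

# Hypothetical stand-ins for df_final and df_gdp (all values are made up).
wash = pd.DataFrame({
    "ISO3": ["AFG", "AFG", "ALB", "XKX"],
    "Year": [2019, 2020, 2019, 2019],
    "Coverage": [84.0, 80.0, 95.0, 70.0],
})
gdp = pd.DataFrame({
    "ISO3": ["AFG", "AFG", "ALB"],
    "Year": [2019, 2020, 2019],
    "GDP per capita (current US$)": [500.5, 516.9, 5396.2],
})

# An outer merge with indicator=True labels each row by where its key was found.
audit = pd.merge(wash, gdp, on=["ISO3", "Year"], how="outer", indicator=True)

# Rows that an inner join would drop because no GDP figure matches them.
unmatched = audit.loc[audit["_merge"] == "left_only", ["ISO3", "Year"]]
print(unmatched)

# validate="many_to_one" raises if the GDP table repeats an (ISO3, Year) key.
merged = pd.merge(wash, gdp, on=["ISO3", "Year"], how="inner", validate="many_to_one")
print(len(wash), "->", len(merged), "rows after the inner join")
```

Running the same check on `df_final` and `df_gdp` would list the country-year pairs that do not survive into `final_dataset`.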
final_dataset = final_dataset.drop(['Country Name'], axis=1)
final_dataset.head()
| | ISO3 | Country | Residence / Facility Type | Service Type | Year | Coverage | Population | Service level | Indicator Name | GDP per capita (current US$) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AFG | Afghanistan | total | Environmental cleaning | 2019 | 84.00000 | 3.172638e+07 | Basic service | GDP per capita (current US$) | 500.522981 |
| 1 | AFG | Afghanistan | hospital | Environmental cleaning | 2019 | 79.11322 | 2.988067e+07 | Basic service | GDP per capita (current US$) | 500.522981 |
| 2 | AFG | Afghanistan | non_hospital | Environmental cleaning | 2019 | 81.84787 | 3.091353e+07 | Basic service | GDP per capita (current US$) | 500.522981 |
| 3 | AFG | Afghanistan | hospital | Hygiene | 2019 | 28.72340 | 1.084868e+07 | Basic service | GDP per capita (current US$) | 500.522981 |
| 4 | AFG | Afghanistan | total | Sanitation | 2019 | 2.50000 | 9.442375e+05 | Basic service | GDP per capita (current US$) | 500.522981 |
final_dataset.shape
final_dataset.info()
final_dataset.nunique()
print(final_dataset['Coverage'].describe())
print(final_dataset['Service Type'].value_counts())
<class 'pandas.core.frame.DataFrame'>
Int64Index: 40636 entries, 0 to 40635
Data columns (total 10 columns):
 #   Column                        Non-Null Count  Dtype
---  ------                        --------------  -----
 0   ISO3                          40636 non-null  object
 1   Country                       40636 non-null  object
 2   Residence / Facility Type     40636 non-null  object
 3   Service Type                  40636 non-null  object
 4   Year                          40636 non-null  int64
 5   Coverage                      40636 non-null  float64
 6   Population                    40636 non-null  float64
 7   Service level                 40636 non-null  object
 8   Indicator Name                40636 non-null  object
 9   GDP per capita (current US$)  40636 non-null  float64
dtypes: float64(3), int64(1), object(6)
memory usage: 3.4+ MB
count    40636.000000
mean        54.434492
std         43.730914
min          0.000000
25%          2.849000
50%         60.274000
75%        100.000000
max        100.000000
Name: Coverage, dtype: float64
Water                     9296
Health care waste         8370
Sanitation                8326
Hygiene                   7662
Environmental cleaning    6982
Name: Service Type, dtype: int64
Let's first analyze the global trend in average coverage for each residence / facility type, year by year.
import plotly.express as px
import plotly.graph_objects as go

year_to_visualize = [2019, 2020, 2021, 2022]

# Mean coverage per year and facility type, pivoted to one column per year
combined_data = final_dataset[final_dataset['Year'].isin(year_to_visualize)].groupby(['Year', 'Residence / Facility Type']).agg({'Coverage': 'mean'}).reset_index()
pivot_data = combined_data.pivot(index='Residence / Facility Type', columns='Year', values='Coverage')

# Tooltip text: difference between each year and every preceding year
for year in year_to_visualize:
    pivot_data[f'{year}_Diffs'] = [
        ", ".join(
            # .iloc makes the positional lookup explicit (the index holds facility names)
            f"{year_prev}: {pivot_data[year].iloc[i] - pivot_data[year_prev].iloc[i]:.2f}"
            for year_prev in year_to_visualize if year_prev < year
        )
        if year > year_to_visualize[0] else "N/A"
        for i in range(len(pivot_data))
    ]

# Use one explicit shade per year from the sequential Blues palette
# so the years are visually distinct
blues = px.colors.sequential.Blues
year_colors = [blues[min(2 + 2 * i, len(blues) - 1)] for i in range(len(year_to_visualize))]

fig = go.Figure()
x = pivot_data.index
for i, year in enumerate(year_to_visualize):
    tooltip_text = [
        f"Year: {year}<br>Coverage: {pivot_data[year][facility]:.2f}%<br>"
        f"Differences from preceding years: {pivot_data[f'{year}_Diffs'][facility]}"
        for facility in pivot_data.index
    ]
    fig.add_trace(go.Bar(
        x=x,
        y=pivot_data[year],
        name=str(year),
        hoverinfo="text",
        hovertext=tooltip_text,
        marker=dict(color=year_colors[i])
    ))

fig.update_layout(
    title="Coverage of Residence / Facility Type by Year (with Differences from Preceding Years)",
    xaxis_title="Residence / Facility Type",
    yaxis_title="Coverage (%)",
    barmode="group",
    xaxis_tickangle=-45,
    legend_title="Year",
    template="plotly_white"
)
fig.show()
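The tooltip in the chart above reports, for each year, the difference in mean coverage from every preceding year. As a minimal sketch, the same numbers can be computed with a single vectorized call instead of nested loops. The `pivot` frame and the helper `diffs_from_preceding` are hypothetical and use made-up values; the facility labels are assumptions as well.

```python
import pandas as pd

# Hypothetical mean coverage (%) per facility type and year; values are made up.
pivot = pd.DataFrame(
    {2019: [50.0, 60.0], 2020: [52.0, 61.0], 2021: [55.0, 59.0], 2022: [58.0, 63.0]},
    index=pd.Index(["rural", "urban"], name="Residence / Facility Type"),
)

def diffs_from_preceding(pivot: pd.DataFrame, year: int) -> pd.DataFrame:
    """Return `year` minus every earlier year, in percentage points, per row."""
    earlier = [y for y in pivot.columns if y < year]
    # rsub computes (other - self): the chosen year minus each earlier column.
    return pivot[earlier].rsub(pivot[year], axis=0)

diffs_2022 = diffs_from_preceding(pivot, 2022)
print(diffs_2022)
```

Each column of the result is labelled by the earlier year being compared against, so formatting it into a tooltip string is a matter of joining one row's values.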
Let's check the trends for specific countries in specific years.
from jupyter_dash import JupyterDash
from dash import dcc, html, Input, Output

JupyterDash._server_threads.clear()

countries = [{'label': country, 'value': country} for country in final_dataset['Country'].unique()]
years = [{'label': year, 'value': year} for year in final_dataset['Year'].unique()]

app1 = JupyterDash(name='ResidenceCoverage')
app1.layout = html.Div([
    html.H1("Coverage of Residence / Facility Type by Year", style={'textAlign': 'center'}),
    html.Div([
        dcc.Dropdown(
            id='country-dropdown-1',
            options=countries,
            placeholder="Select a Country",
            style={'width': '48%', 'display': 'inline-block', 'margin-right': '2%'}
        ),
        dcc.Dropdown(
            id='year-dropdown-1',
            options=years,
            placeholder="Select Year(s)",
            multi=True,
            style={'width': '48%', 'display': 'inline-block'}
        )
    ]),
    dcc.Graph(id='residence-graph')
])

@app1.callback(
    Output('residence-graph', 'figure'),
    [Input('country-dropdown-1', 'value'),
     Input('year-dropdown-1', 'value')]
)
def update_residence_graph(selected_country, selected_years):
    # Prompt the user until both dropdowns have a selection
    if not selected_country or not selected_years:
        return go.Figure(
            layout={'title': "Select a country and year(s) to view the data"}
        )
    filtered_data = final_dataset[
        (final_dataset['Country'] == selected_country) & (final_dataset['Year'].isin(selected_years))
    ]
    if filtered_data.empty:
        return go.Figure(
            layout={'title': f"No data available for {selected_country} in selected year(s)"}
        )
    # Mean coverage per facility type, one column per year
    combined_data = filtered_data.groupby(['Year', 'Residence / Facility Type']).agg({'Coverage': 'mean'}).reset_index()
    pivot_data = combined_data.pivot(index='Residence / Facility Type', columns='Year', values='Coverage').fillna(0)
    fig = go.Figure()
    # Skip selected years that have no rows for this country (avoids a KeyError)
    for year in sorted(y for y in selected_years if y in pivot_data.columns):
        fig.add_trace(go.Bar(
            x=pivot_data.index,
            y=pivot_data[year],
            name=str(year),
        ))
    fig.update_layout(
        title=f"Coverage for {selected_country} ({', '.join(map(str, selected_years))})",
        xaxis_title="Residence / Facility Type",
        yaxis_title="Coverage (%)",
        barmode="group"
    )
    return fig

app1.run_server(mode="inline", port=8050)
UserWarning: JupyterDash is deprecated, use Dash instead. See https://dash.plotly.com/dash-in-jupyter for more details.
Taking Ghana as an example, we can see that coverage is higher for the urban facility type than for the rural facility type. Bhutan, by contrast, has equal coverage for both facility types. This visualization lets us dig into the coverage of each country individually.
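The urban versus rural comparison described above can also be expressed as a single number per country. The sketch below is one possible way to do that, not the notebook's own method: it assumes the facility types are labelled `urban` and `rural`, and the frame `df` holds made-up values rather than the real coverage figures for Ghana or Bhutan.

```python
import pandas as pd

# Hypothetical rows shaped like final_dataset; coverage values are made up.
df = pd.DataFrame({
    "Country": ["Ghana", "Ghana", "Bhutan", "Bhutan"],
    "Year": [2021, 2021, 2021, 2021],
    "Residence / Facility Type": ["urban", "rural", "urban", "rural"],
    "Coverage": [70.0, 40.0, 55.0, 55.0],
})

# Mean coverage per country, with one column per facility type.
by_type = df.pivot_table(
    index="Country",
    columns="Residence / Facility Type",
    values="Coverage",
    aggfunc="mean",
)

# Positive gap: urban coverage exceeds rural coverage (percentage points).
by_type["urban_minus_rural"] = by_type["urban"] - by_type["rural"]
print(by_type.sort_values("urban_minus_rural", ascending=False))
```

Sorting by the gap column puts the countries with the largest urban advantage first, which gives a quick shortlist of countries worth inspecting in the dashboard.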
JupyterDash._server_threads.clear()

from jupyter_dash import JupyterDash
from dash import dcc, html, Input, Output
import plotly.graph_objects as go
import pandas as pd

countries = [{'label': country, 'value': country} for country in final_dataset['Country'].unique()]
years = [{'label': year, 'value': year} for year in sorted(final_dataset['Year'].unique())]
residence_types = [{'label': res, 'value': res} for res in final_dataset['Residence / Facility Type'].unique()]

# One fixed color per service type, shared by all four graphs
service_type_colors = {
    service_type: color
    for service_type, color in zip(
        final_dataset['Service Type'].unique(),
        ['#636EFA', '#EF553B', '#00CC96', '#AB63FA', '#FFA15A', '#19D3F3', '#FF6692', '#B6E880']
    )
}

app = JupyterDash(name='ServiceLevels')
app.layout = html.Div([
    html.H1("Population Distribution by Service Types", style={'textAlign': 'center'}),
    html.Div([
        html.Div([
            html.Label("Select a Country:"),
            dcc.Dropdown(
                id='country-dropdown',
                options=countries,
                placeholder="Select a Country",
                style={'width': '90%'}
            )
        ], style={'width': '30%', 'display': 'inline-block', 'verticalAlign': 'top'}),
        html.Div([
            html.Label("Select Year(s):"),
            dcc.Dropdown(
                id='year-dropdown',
                options=years,
                placeholder="Select Year(s)",
                multi=True,
                style={'width': '90%'}
            )
        ], style={'width': '30%', 'display': 'inline-block', 'verticalAlign': 'top'}),
        html.Div([
            html.Label("Select Residence / Facility Type:"),
            dcc.Dropdown(
                id='residence-dropdown',
                options=residence_types,
                placeholder="Select Residence / Facility Type",
                multi=True,
                style={'width': '90%'}
            )
        ], style={'width': '30%', 'display': 'inline-block', 'verticalAlign': 'top'}),
    ], style={'marginBottom': '20px'}),
    # Graphs for each service level
    html.Div([
        dcc.Graph(id='no-service-graph', style={'display': 'inline-block', 'width': '48%'}),
        dcc.Graph(id='limited-service-graph', style={'display': 'inline-block', 'width': '48%'}),
    ]),
    html.Div([
        dcc.Graph(id='basic-service-graph', style={'display': 'inline-block', 'width': '48%'}),
        dcc.Graph(id='insufficient-service-graph', style={'display': 'inline-block', 'width': '48%'}),
    ]),
])

def create_graph(filtered_data, service_level, default_title):
    # Bar chart of total population per service type for one service level
    data = filtered_data[filtered_data['Service level'] == service_level]
    if data.empty:
        return go.Figure(layout={'title': f"No data available for {service_level}"})
    grouped_data = data.groupby('Service Type')['Population'].sum().reset_index()
    grouped_data = grouped_data[grouped_data['Population'] > 0]  # Exclude zero population rows
    fig = go.Figure(
        data=[
            go.Bar(
                x=grouped_data['Service Type'],
                y=grouped_data['Population'],
                marker_color=[service_type_colors[stype] for stype in grouped_data['Service Type']]
            )
        ],
        layout={
            'title': default_title,
            'xaxis_title': 'Service Type',
            'yaxis_title': 'Population'
        }
    )
    return fig

@app.callback(
    [Output('no-service-graph', 'figure'),
     Output('limited-service-graph', 'figure'),
     Output('basic-service-graph', 'figure'),
     Output('insufficient-service-graph', 'figure')],
    [Input('country-dropdown', 'value'),
     Input('year-dropdown', 'value'),
     Input('residence-dropdown', 'value')]
)
def update_graphs(selected_country, selected_years, selected_residence):
    # Each filter is applied only when its dropdown has a selection
    filtered_data = final_dataset.copy()
    if selected_country:
        filtered_data = filtered_data[filtered_data['Country'] == selected_country]
    if selected_years:
        filtered_data = filtered_data[filtered_data['Year'].isin(selected_years)]
    if selected_residence:
        filtered_data = filtered_data[filtered_data['Residence / Facility Type'].isin(selected_residence)]
    no_service_fig = create_graph(filtered_data, 'No service', 'Population Distribution: No Service Level')
    limited_service_fig = create_graph(filtered_data, 'Limited service', 'Population Distribution: Limited Service Level')
    basic_service_fig = create_graph(filtered_data, 'Basic service', 'Population Distribution: Basic Service')
    insufficient_service_fig = create_graph(filtered_data, 'Insufficient data', 'Population Distribution: Insufficient data Level')
    return no_service_fig, limited_service_fig, basic_service_fig, insufficient_service_fig

app.run_server(mode="inline", port=8056)
JupyterDash._server_threads.clear()

import plotly.express as px

year_to_visualize = [2019, 2020, 2021, 2022]

# Mean coverage per year and service type, pivoted to one column per year
combined_data = final_dataset[final_dataset['Year'].isin(year_to_visualize)].groupby(['Year', 'Service Type']).agg({'Coverage': 'mean'}).reset_index()
pivot_data = combined_data.pivot(index='Service Type', columns='Year', values='Coverage')

# Tooltip text: difference between each year and every preceding year
for year in year_to_visualize:
    pivot_data[f'{year}_Diffs'] = [
        ", ".join(
            # .iloc makes the positional lookup explicit (the index holds service type names)
            f"{year_prev}: {pivot_data[year].iloc[i] - pivot_data[year_prev].iloc[i]:.2f}"
            for year_prev in year_to_visualize if year_prev < year
        )
        if year > year_to_visualize[0] else "N/A"
        for i in range(len(pivot_data))
    ]

# Use one explicit shade per year from the sequential Blues palette
# so the years are visually distinct
blues = px.colors.sequential.Blues
year_colors = [blues[min(2 + 2 * i, len(blues) - 1)] for i in range(len(year_to_visualize))]

fig = go.Figure()
x = pivot_data.index
for i, year in enumerate(year_to_visualize):
    tooltip_text = [
        f"Year: {year}<br>Coverage: {pivot_data[year][service]:.2f}%<br>"
        f"Differences from preceding years: {pivot_data[f'{year}_Diffs'][service]}"
        for service in pivot_data.index
    ]
    fig.add_trace(go.Bar(
        x=x,
        y=pivot_data[year],
        name=str(year),
        hoverinfo="text",
        hovertext=tooltip_text,
        marker=dict(color=year_colors[i])
    ))

fig.update_layout(
    title="Global Average Coverage of Service Types by Year (with Differences from Preceding Years)",
    xaxis_title="Service Type",
    yaxis_title="Average Coverage (%)",
    barmode="group",
    xaxis_tickangle=-45,
    legend_title="Year",
    template="plotly_white"
)
fig.show()
JupyterDash._server_threads.clear()

app2 = JupyterDash(name='ServiceTypeCoverage')
app2.layout = html.Div([
    html.H1("Coverage of Service Types by Year", style={'textAlign': 'center'}),
    html.Div([
        dcc.Dropdown(
            id='country-dropdown-2',
            options=countries,
            placeholder="Select a Country",
            style={'width': '48%', 'display': 'inline-block', 'margin-right': '2%'}
        ),
        dcc.Dropdown(
            id='year-dropdown-2',
            options=years,
            placeholder="Select Year(s)",
            multi=True,
            style={'width': '48%', 'display': 'inline-block'}
        )
    ]),
    dcc.Graph(id='service-graph')
])

@app2.callback(
    Output('service-graph', 'figure'),
    [Input('country-dropdown-2', 'value'),
     Input('year-dropdown-2', 'value')]
)
def update_service_graph(selected_country, selected_years):
    # Prompt the user until both dropdowns have a selection
    if not selected_country or not selected_years:
        return go.Figure(
            layout={'title': "Select a country and year(s) to view the data"}
        )
    filtered_data = final_dataset[
        (final_dataset['Country'] == selected_country) & (final_dataset['Year'].isin(selected_years))
    ]
    if filtered_data.empty:
        return go.Figure(
            layout={'title': f"No data available for {selected_country} in selected year(s)"}
        )
    # Mean coverage per service type, one column per year
    combined_data = filtered_data.groupby(['Year', 'Service Type']).agg({'Coverage': 'mean'}).reset_index()
    pivot_data = combined_data.pivot(index='Service Type', columns='Year', values='Coverage').fillna(0)
    fig = go.Figure()
    # Skip selected years that have no rows for this country (avoids a KeyError)
    for year in sorted(y for y in selected_years if y in pivot_data.columns):
        fig.add_trace(go.Bar(
            x=pivot_data.index,
            y=pivot_data[year],
            name=str(year),
        ))
    fig.update_layout(
        title=f"Coverage for {selected_country} ({', '.join(map(str, selected_years))})",
        xaxis_title="Service Type",
        yaxis_title="Coverage (%)",
        barmode="group"
    )
    return fig

app2.run_server(mode="inline", port=8051)
import plotly.express as px

JupyterDash._server_threads.clear()

years = [{'label': year, 'value': year} for year in final_dataset['Year'].unique()]
service_types = [{'label': service, 'value': service} for service in final_dataset['Service Type'].unique()]

app3 = JupyterDash("Viz3")
app3.layout = html.Div([
    html.H1("Coverage of Services", style={'textAlign': 'center'}),
    html.Div([
        dcc.Dropdown(
            id='year-dropdown',
            options=years,
            placeholder="Select a Year",
            style={'width': '48%', 'display': 'inline-block', 'margin-right': '2%'}
        ),
        dcc.Dropdown(
            id='service-type-dropdown',
            options=service_types,
            placeholder="Select a Service Type",
            style={'width': '48%', 'display': 'inline-block'}
        )
    ]),
    dcc.Graph(id='choropleth-map')
])

@app3.callback(
    Output('choropleth-map', 'figure'),
    [Input('year-dropdown', 'value'),
     Input('service-type-dropdown', 'value')]
)
def update_choropleth(year, service_type):
    if not year or not service_type:
        return px.scatter(title="Select both year and service type to view the map")
    map_df = final_dataset[(final_dataset['Year'] == year) & (final_dataset['Service Type'] == service_type)]
    if map_df.empty:
        return px.scatter(title=f"No data available for {service_type} in {year}")
    # Average over facility types and service levels so each country appears once on the map
    map_df = map_df.groupby(['ISO3', 'Country'], as_index=False)['Coverage'].mean()
    fig = px.choropleth(map_df, locations="ISO3", color="Coverage",
                        hover_name="Country",
                        title=f"Coverage of {service_type} in {year}",
                        color_continuous_scale="viridis")
    return fig

app3.run_server(mode='inline', port=8052)
JupyterDash._server_threads.clear()

years = [{'label': year, 'value': year} for year in final_dataset['Year'].unique()]
facility_types = [{'label': facility, 'value': facility} for facility in final_dataset['Residence / Facility Type'].unique()]

app4 = JupyterDash(__name__)
app4.layout = html.Div([
    html.H1("Dynamic Choropleth Map for Facility and Residence Type", style={'textAlign': 'center'}),
    html.Div([
        dcc.Dropdown(
            id='year-dropdown',
            options=years,
            placeholder="Select a Year",
            style={'width': '48%', 'display': 'inline-block', 'margin-right': '2%'}
        ),
        dcc.Dropdown(
            id='facility-type-dropdown',
            options=facility_types,
            placeholder="Select a Facility/Residence Type",
            style={'width': '48%', 'display': 'inline-block'}
        )
    ]),
    dcc.Graph(id='choropleth-map')
])

@app4.callback(
    Output('choropleth-map', 'figure'),
    [Input('year-dropdown', 'value'),
     Input('facility-type-dropdown', 'value')]
)
def update_choropleth(year, facility_type):
    if not year or not facility_type:
        return px.scatter(title="Select both year and facility type to view the map")
    map_df = final_dataset[(final_dataset['Year'] == year) & (final_dataset['Residence / Facility Type'] == facility_type)]
    if map_df.empty:
        return px.scatter(title=f"No data available for {facility_type} in {year}")
    # Average over service types and service levels so each country appears once on the map
    map_df = map_df.groupby(['ISO3', 'Country'], as_index=False)['Coverage'].mean()
    fig = px.choropleth(map_df, locations="ISO3", color="Coverage",
                        hover_name="Country",
                        title=f"Coverage for {facility_type} in {year}",
                        color_continuous_scale="viridis")
    return fig

app4.run_server(mode='inline', port=8053)
# Mean coverage per year and service type for the trend lines
temp_df = final_dataset.groupby(['Year', 'Service Type'])['Coverage'].mean().reset_index()

fig = px.line(
    temp_df,
    x='Year',
    y='Coverage',
    color='Service Type',
    markers=True,
    title='Temporal Trends in Service Coverage',
    labels={
        'Year': 'Year',
        'Coverage': 'Average Coverage (%)',
        'Service Type': 'Service Type'
    },
    hover_data={
        'Year': True,
        'Coverage': ':.2f',
        'Service Type': True
    }
)
# Show only the four whole years on the x-axis
fig.update_xaxes(
    tickmode='array',
    tickvals=[2019, 2020, 2021, 2022],
    range=[2018.5, 2022.5]
)
fig.update_layout(
    title={'font': {'size': 16}},
    xaxis_title={'font': {'size': 12}},
    yaxis_title={'font': {'size': 12}},
    legend_title={'font': {'size': 12}},
    template='seaborn'
)
fig.show()
import pandas as pd
import plotly.express as px
from dash import Dash, dcc, html, Input, Output

# Ensure 'Year' is numeric
final_dataset['Year'] = final_dataset['Year'].astype(int)

# Mean coverage per year, country, service type, and service level
df_grouped = final_dataset.groupby(['Year', 'Country', 'Service Type', 'Service level'])['Coverage'].mean().reset_index()

# Create a Dash App
app = Dash(__name__)
app.layout = html.Div([
    html.H1("Population Coverage by Service Levels"),
    html.Label("Select Year:"),
    dcc.Dropdown(
        id='year-dropdown',
        options=[{'label': year, 'value': year} for year in sorted(df_grouped['Year'].unique())],
        value=sorted(df_grouped['Year'].unique())[0],  # Default to the first year
        placeholder="Select a year..."
    ),
    html.Label("Select Country:"),
    dcc.Dropdown(
        id='country-dropdown',
        options=[{'label': country, 'value': country} for country in df_grouped['Country'].unique()],
        multi=True,
        value=df_grouped['Country'].unique()[:3].tolist(),  # Default selected countries
        placeholder="Select countries..."
    ),
    dcc.Graph(id='bar-chart')
])

@app.callback(
    Output('bar-chart', 'figure'),
    [Input('year-dropdown', 'value'),
     Input('country-dropdown', 'value')]
)
def update_chart(selected_year, selected_countries):
    # Filter the pre-aggregated data for the selected year and countries;
    # a cleared country dropdown is treated as an empty selection
    coverage_per_year = df_grouped[
        (df_grouped['Year'] == selected_year) &
        (df_grouped['Country'].isin(selected_countries or []))
    ]
    # Generate simple bar chart
    fig = px.bar(
        coverage_per_year,
        x='Country',
        y='Coverage',
        color='Service level',
        facet_col='Service Type',
        title=f"Population Coverage by Service Levels for {selected_year}",
        labels={'Coverage': 'Percentage Coverage'},
        barmode='group',
        height=600
    )
    # Remove "Service Type=" from facet headers
    for annotation in fig.layout.annotations:
        if "Service Type=" in annotation.text:
            annotation.text = annotation.text.replace("Service Type=", "")
    # Update layout for better visualization
    fig.update_layout(
        xaxis_tickangle=45,
        legend_title="Service Level",
        title_font_size=20
    )
    return fig

if __name__ == '__main__':
    app.run_server(debug=True, port=8058)
import pandas as pd
import plotly.graph_objects as go
from dash import Dash, dcc, html, Input, Output
# Ensure 'Year' is numeric
final_dataset['Year'] = final_dataset['Year'].astype(int)
# Create a Dash App
app = Dash(__name__)
# Layout for the Dash app
app.layout = html.Div([
html.H1("Coverage by Service Type, Service Level, and Country"),
html.Label("Select Year:"),
dcc.Dropdown(
id='year-dropdown',
options=[{'label': year, 'value': year} for year in sorted(final_dataset['Year'].unique())],
value=sorted(final_dataset['Year'].unique())[0], # Default to the first year
placeholder="Select a year..."
),
html.Label("Select Service Level:"),
dcc.Dropdown(
id='service-level-dropdown',
options=[{'label': level, 'value': level} for level in final_dataset['Service level'].unique()],
value=final_dataset['Service level'].unique()[0], # Default to the first service level
placeholder="Select a service level..."
),
html.Label("Select Service Type:"),
dcc.Dropdown(
id='service-type-dropdown',
options=[{'label': service_type, 'value': service_type} for service_type in final_dataset['Service Type'].unique()],
value=final_dataset['Service Type'].unique()[0], # Default to the first service type
placeholder="Select a service type..."
),
dcc.Graph(id='choropleth-map')
])
@app.callback(
Output('choropleth-map', 'figure'),
[Input('year-dropdown', 'value'),
Input('service-level-dropdown', 'value'),
Input('service-type-dropdown', 'value')]
)
def update_map(selected_year, selected_service_level, selected_service_type):
    # Filter data for the selected year, service level, and service type
    filtered_data = final_dataset[
        (final_dataset['Year'] == selected_year) &
        (final_dataset['Service level'] == selected_service_level) &
        (final_dataset['Service Type'] == selected_service_type)
    ]
    # Average coverage per country (one row per country)
    country_coverage = filtered_data.groupby('Country')['Coverage'].mean().reset_index()
    # Create the choropleth map
    fig = go.Figure(go.Choropleth(
        locations=country_coverage['Country'],
        locationmode='country names',
        z=country_coverage['Coverage'],
        hoverinfo='location+z',
        colorbar_title="Coverage",
        colorscale="Viridis"
    ))
    # Update the layout of the map
    fig.update_layout(
        title=f"Coverage for {selected_service_type} - {selected_service_level} ({selected_year})",
        geo=dict(showcoastlines=True, coastlinecolor="Black", projection_type="mercator"),
        title_font_size=20,
        width=900,
        height=500
    )
    return fig
if __name__ == '__main__':
    app.run_server(debug=True, port=8059)
Is there a positive or negative correlation between population size and coverage?
What is the overall correlation for each service type?
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load the dataset
data = final_dataset
# Convert Population column to more understandable form (e.g., millions)
#data['Population'] = data['Population'] / 1e6
# Get unique service types
service_types = data['Service Type'].unique()
# Plot a scatter plot for each service type
for service_type in service_types:
    plt.figure(figsize=(12, 8))
    service_data = data[data['Service Type'] == service_type]
    sns.scatterplot(
        data=service_data,
        x='Population',
        y='Coverage',
        hue='Service Type',
        palette='viridis',
        s=100,
        alpha=0.8
    )
    # Add labels and title (Population is plotted in raw units; the
    # conversion to millions above is commented out)
    plt.title(f'Correlation Between Population Size and Coverage for {service_type}', fontsize=16)
    plt.xlabel('Population Size', fontsize=14)
    plt.ylabel('Percentage Coverage', fontsize=14)
    plt.grid(True)
    # Show the plot
    plt.show()
# Optionally, calculate the correlation coefficient for each service type
for service_type in service_types:
    service_data = data[data['Service Type'] == service_type]
    correlation = service_data[['Population', 'Coverage']].corr().iloc[0, 1]
    print(f"Correlation coefficient for {service_type}: {correlation:.2f}")
Correlation coefficient for Environmental cleaning: 0.15
Correlation coefficient for Hygiene: 0.15
Correlation coefficient for Sanitation: 0.15
Correlation coefficient for Water: 0.17
Correlation coefficient for Health care waste: 0.16
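The coefficients above (0.15 to 0.17) point to, at most, a weak positive linear relationship between population and coverage for every service type. As a compact alternative to the loop, the same per-group Pearson coefficient can be collected into a single Series. The sketch below runs on a made-up frame, chosen so the two groups have perfectly positive and perfectly negative correlation, purely to show the mechanics.

```python
import pandas as pd

# Toy data (not the WASH dataset): one perfectly increasing group and
# one perfectly decreasing group.
toy = pd.DataFrame({
    "Service Type": ["Water"] * 4 + ["Hygiene"] * 4,
    "Population": [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0],
    "Coverage": [10.0, 20.0, 30.0, 40.0, 40.0, 30.0, 20.0, 10.0],
})

# Pearson correlation between Population and Coverage within each service type.
corr_by_type = pd.Series({
    name: group["Population"].corr(group["Coverage"])
    for name, group in toy.groupby("Service Type")
})

print(corr_by_type.round(2))
```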
import pandas as pd
import matplotlib.pyplot as plt
# Load the dataset
data = final_dataset
# Convert Population column to more understandable form (e.g., millions)
#data['Population'] = data['Population'] / 1e6
# Ensure the Year column is treated as an integer
data['Year'] = data['Year'].astype(int)
# Filter data for the relevant service types and service level
sanitation_water_data = data[(data['Service Type'].isin(['Sanitation', 'Water', 'Hygiene', 'Environmental cleaning', 'Health care waste'])) & (data['Service level'] == 'No service')]
# Group data by year and service type, summing population
grouped_data = sanitation_water_data.groupby(['Year', 'Service Type'])['Population'].sum().unstack(fill_value=0)
# Plot trends over time (DataFrame.plot creates its own figure)
grouped_data.plot(kind='line', marker='o', figsize=(14, 8), linewidth=2)
# Dynamically set x-axis ticks based on unique years
plt.xticks(ticks=grouped_data.index, labels=grouped_data.index.astype(str), fontsize=12)
# Add labels and title (Population is in raw units; the conversion to
# millions above is commented out)
plt.title('Population Without Access to Service (No Service) by Year', fontsize=16)
plt.xlabel('Year', fontsize=14)
plt.ylabel('Population', fontsize=14)
plt.legend(title='Service Type', fontsize=12)
plt.grid(True)
# Show the plot
plt.tight_layout()
plt.show()
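The grouping step above sums `Population` over every "No service" row, including the `hospital`, `non_hospital`, and `total` facility types. If those rows each describe the same national population, as the `head()` output further down suggests, the sum may count people more than once. The sketch below shows the same group-and-unstack reshaping restricted to `total` rows, on made-up numbers; whether that restriction is appropriate for the real data is an assumption to verify.

```python
import pandas as pd

# Toy rows shaped like final_dataset; populations are made up.
toy = pd.DataFrame({
    "Residence / Facility Type": ["total", "hospital", "total", "total"],
    "Service Type": ["Water", "Water", "Water", "Sanitation"],
    "Service level": ["No service"] * 4,
    "Year": [2019, 2019, 2020, 2019],
    "Population": [100.0, 90.0, 80.0, 50.0],
})

# Keep only "No service" rows for the overall ("total") facility type.
no_service_total = toy[
    (toy["Service level"] == "No service")
    & (toy["Residence / Facility Type"] == "total")
]

# One row per year, one column per service type; missing pairs become 0.
trend = (
    no_service_total.groupby(["Year", "Service Type"])["Population"]
    .sum()
    .unstack(fill_value=0)
)
print(trend)
```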
final_dataset.head()
| | ISO3 | Country | Residence / Facility Type | Service Type | Year | Coverage | Population | Service level | Indicator Name | GDP per capita (current US$) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AFG | Afghanistan | total | Environmental cleaning | 2019 | 84.00000 | 3.172638e+07 | Basic service | GDP per capita (current US$) | 500.522981 |
| 1 | AFG | Afghanistan | hospital | Environmental cleaning | 2019 | 79.11322 | 2.988067e+07 | Basic service | GDP per capita (current US$) | 500.522981 |
| 2 | AFG | Afghanistan | non_hospital | Environmental cleaning | 2019 | 81.84787 | 3.091353e+07 | Basic service | GDP per capita (current US$) | 500.522981 |
| 3 | AFG | Afghanistan | hospital | Hygiene | 2019 | 28.72340 | 1.084868e+07 | Basic service | GDP per capita (current US$) | 500.522981 |
| 4 | AFG | Afghanistan | total | Sanitation | 2019 | 2.50000 | 9.442375e+05 | Basic service | GDP per capita (current US$) | 500.522981 |
Dataset with Residence / Facility Type = total
total_residence_data = final_dataset[final_dataset['Residence / Facility Type'] == 'total']
total_residence_data.head()
| | ISO3 | Country | Residence / Facility Type | Service Type | Year | Coverage | Population | Service level | Indicator Name | GDP per capita (current US$) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AFG | Afghanistan | total | Environmental cleaning | 2019 | 84.0 | 31726380.0 | Basic service | GDP per capita (current US$) | 500.522981 |
| 4 | AFG | Afghanistan | total | Sanitation | 2019 | 2.5 | 944237.5 | Basic service | GDP per capita (current US$) | 500.522981 |
| 5 | AFG | Afghanistan | total | Water | 2019 | 79.0 | 29837905.0 | Basic service | GDP per capita (current US$) | 500.522981 |
| 8 | AFG | Afghanistan | total | Health care waste | 2019 | 82.0 | 30970990.0 | Basic service | GDP per capita (current US$) | 500.522981 |
| 12 | AFG | Afghanistan | total | Sanitation | 2019 | 92.0 | 34747940.0 | Limited service | GDP per capita (current US$) | 500.522981 |
Creating a working copy of the filtered dataset
total_residence_data_copy = total_residence_data.copy()
total_residence_data_copy.head()
| | ISO3 | Country | Residence / Facility Type | Service Type | Year | Coverage | Population | Service level | Indicator Name | GDP per capita (current US$) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AFG | Afghanistan | total | Environmental cleaning | 2019 | 84.0 | 31726380.0 | Basic service | GDP per capita (current US$) | 500.522981 |
| 4 | AFG | Afghanistan | total | Sanitation | 2019 | 2.5 | 944237.5 | Basic service | GDP per capita (current US$) | 500.522981 |
| 5 | AFG | Afghanistan | total | Water | 2019 | 79.0 | 29837905.0 | Basic service | GDP per capita (current US$) | 500.522981 |
| 8 | AFG | Afghanistan | total | Health care waste | 2019 | 82.0 | 30970990.0 | Basic service | GDP per capita (current US$) | 500.522981 |
| 12 | AFG | Afghanistan | total | Sanitation | 2019 | 92.0 | 34747940.0 | Limited service | GDP per capita (current US$) | 500.522981 |
Aggregating coverage and population across rows that share the same ISO3 code, Service Type, and Year, which combines the separate service-level rows into a single population-weighted coverage value per group
total_residence_data_copy['total_coverage_population'] = (total_residence_data_copy['Coverage'] * total_residence_data_copy['Population']) / 100
# Group by ISO3, Country, Service Type, and Year and calculate required aggregations
aggregated_data = total_residence_data_copy.groupby(['ISO3', 'Country', 'Service Type', 'Year', 'GDP per capita (current US$)']).agg(
total_population=('Population', 'sum'),
total_coverage_population=('total_coverage_population', 'sum')
).reset_index()
# Calculate the Coverage column in the new DataFrame
aggregated_data['Coverage'] = (aggregated_data['total_coverage_population'] * 100 / aggregated_data['total_population']).round(1)
# Rename the total_population column to Population
aggregated_data.rename(columns={'total_population': 'Population'}, inplace=True)
# Select and reorder columns for the final DataFrame
cleaned_dataset = aggregated_data[['ISO3', 'Country', 'Service Type', 'Year', 'Coverage', 'Population', 'GDP per capita (current US$)']]
cleaned_dataset.head()
| | ISO3 | Country | Service Type | Year | Coverage | Population | GDP per capita (current US$) |
|---|---|---|---|---|---|---|---|
| 0 | AFG | Afghanistan | Environmental cleaning | 2019 | 73.1 | 37769500.0 | 500.522981 |
| 1 | AFG | Afghanistan | Environmental cleaning | 2020 | 73.1 | 38972232.0 | 516.866797 |
| 2 | AFG | Afghanistan | Environmental cleaning | 2021 | 73.1 | 40099464.0 | 363.674087 |
| 3 | AFG | Afghanistan | Environmental cleaning | 2022 | 73.1 | 41128772.0 | 353.000000 |
| 4 | AFG | Afghanistan | Health care waste | 2019 | 70.5 | 37769500.0 | 500.522981 |
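To make the aggregation concrete: within each group, the resulting `Coverage` is the mean of the service-level coverage values weighted by each row's `Population`, not their sum. The sketch below repeats the same steps on a made-up group with two service levels. The country code `XXX` and all numbers are hypothetical.

```python
import pandas as pd

# Toy group: two service levels for one country, service type, and year.
toy = pd.DataFrame({
    "ISO3": ["XXX", "XXX"],
    "Service Type": ["Water", "Water"],
    "Year": [2019, 2019],
    "Service level": ["Basic service", "No service"],
    "Coverage": [84.0, 16.0],
    "Population": [84.0, 16.0],
})

# Per-row contribution: coverage (%) times that row's population.
toy["weighted"] = toy["Coverage"] * toy["Population"] / 100

agg = toy.groupby(["ISO3", "Service Type", "Year"]).agg(
    Population=("Population", "sum"),
    weighted=("weighted", "sum"),
).reset_index()

# Weighted mean: (84*84 + 16*16) / (84 + 16) = 73.12, rounded to one decimal.
agg["Coverage"] = (agg["weighted"] * 100 / agg["Population"]).round(1)
print(agg[["ISO3", "Service Type", "Year", "Coverage", "Population"]])
```

Because larger rows carry more weight, the aggregated figure leans toward the coverage value of the most populous service level.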
from jupyter_dash import JupyterDash
from dash import dcc, html, Input, Output
import plotly.graph_objects as go
import pandas as pd
JupyterDash._server_threads.clear()
Visualization for population vs coverage
Dividing the graph into top 20 and bottom 50 to get a better view of the data
from jupyter_dash import JupyterDash
from dash import dcc, html, Input, Output
import plotly.express as px
import pandas as pd
# Initialize the app
app = JupyterDash(__name__)
# Dropdown options
service_types = [{'label': service, 'value': service} for service in cleaned_dataset['Service Type'].unique()]
# App layout
app.layout = html.Div([
html.H1("Interactive Coverage Analysis", style={'textAlign': 'center'}),
# Service Type dropdown and Year slider
html.Div([
dcc.Dropdown(
id='service-type-dropdown',
options=service_types,
value=cleaned_dataset['Service Type'].unique()[0], # Default to the first service type
placeholder="Select a Service Type",
style={'width': '48%', 'display': 'inline-block', 'margin-right': '2%'}
),
dcc.Slider(
id='year-slider',
min=cleaned_dataset['Year'].min(),
max=cleaned_dataset['Year'].max(),
step=1,
value=cleaned_dataset['Year'].min(),
marks={year: str(year) for year in range(cleaned_dataset['Year'].min(), cleaned_dataset['Year'].max() + 1)},
tooltip={"placement": "bottom", "always_visible": True},
)
]),
# Graph containers
dcc.Graph(id='top-20-population-graph'),
dcc.Graph(id='least-50-population-graph'),
])
# Callback to update graphs
@app.callback(
[Output('top-20-population-graph', 'figure'),
Output('least-50-population-graph', 'figure')],
[Input('service-type-dropdown', 'value'),
Input('year-slider', 'value')]
)
def update_graphs(selected_service_type, selected_year):
    # Filter data by year and service type
    filtered_data = cleaned_dataset[(cleaned_dataset['Year'] == selected_year) &
                                    (cleaned_dataset['Service Type'] == selected_service_type)]
    # Sort by population
    sorted_data = filtered_data.sort_values(by='Population', ascending=False)
    # Top 20 countries with the highest population
    top_20_data = sorted_data.head(20)
    fig1 = px.scatter(
        top_20_data,
        x='Coverage',
        y='Population',
        color='Country',
        size='Population',
        hover_name='Country',
        title=f"Top 20 Countries by Population for {selected_service_type} in {selected_year}",
        labels={'Coverage': 'Coverage (%)', 'Population': 'Population'}
    )
    fig1.update_layout(legend_title_text="Country", legend_orientation="v")
    # Bottom 50 countries with the least population
    least_50_data = sorted_data.tail(50)
    fig2 = px.scatter(
        least_50_data,
        x='Coverage',
        y='Population',
        color='Country',
        size='Population',
        hover_name='Country',
        title=f"50 Countries with Least Population for {selected_service_type} in {selected_year}",
        labels={'Coverage': 'Coverage (%)', 'Population': 'Population'}
    )
    fig2.update_layout(legend_title_text="Country", legend_orientation="v")
    return fig1, fig2
# Run the app
app.run_server(mode='inline', port=8054)
C:\Users\Bhargavi Jahagirdar\anaconda3\Lib\site-packages\dash\dash.py:579: UserWarning: JupyterDash is deprecated, use Dash instead. See https://dash.plotly.com/dash-in-jupyter for more details.
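The callback above takes `head(20)` and `tail(50)` from the same sorted frame. If a given year and service type has fewer than 70 countries, some countries would appear in both charts. One way to keep the two groups disjoint is sketched below with a hypothetical helper, `split_top_bottom`, on made-up data.

```python
import pandas as pd

def split_top_bottom(df, column, n_top=20, n_bottom=50):
    """Return (top, bottom) slices of df ranked by `column`, without overlap."""
    ordered = df.sort_values(by=column, ascending=False)
    top = ordered.head(n_top)
    # Drop rows already in `top` so a short table never lands in both charts.
    bottom = ordered.drop(top.index).tail(n_bottom)
    return top, bottom

# Toy frame with only five rows, fewer than n_top + n_bottom.
toy = pd.DataFrame({"Country": list("ABCDE"), "Population": [5, 4, 3, 2, 1]})
top, bottom = split_top_bottom(toy, "Population", n_top=2, n_bottom=4)

print(top["Country"].tolist(), bottom["Country"].tolist())
```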
GDP vs Coverage
from jupyter_dash import JupyterDash
from dash import dcc, html, Input, Output
import plotly.express as px
import pandas as pd
# Initialize the app
app4 = JupyterDash(__name__)
# Dropdown options
service_types = [{'label': service, 'value': service} for service in cleaned_dataset['Service Type'].unique()]
# App layout
app4.layout = html.Div([
html.H1("Interactive GDP vs. Coverage Analysis", style={'textAlign': 'center'}),
# Service Type dropdown and Year slider
html.Div([
dcc.Dropdown(
id='service-type-dropdown',
options=service_types,
value=cleaned_dataset['Service Type'].unique()[0], # Default to the first service type
placeholder="Select a Service Type",
style={'width': '48%', 'display': 'inline-block', 'margin-right': '2%'}
),
dcc.Slider(
id='year-slider',
min=cleaned_dataset['Year'].min(),
max=cleaned_dataset['Year'].max(),
step=1,
value=cleaned_dataset['Year'].min(),
marks={year: str(year) for year in range(cleaned_dataset['Year'].min(), cleaned_dataset['Year'].max() + 1)},
tooltip={"placement": "bottom", "always_visible": True},
)
]),
# Graph containers
dcc.Graph(id='top-20-gdp-graph'),
dcc.Graph(id='least-50-gdp-graph'),
])
# Callback to update graphs
@app4.callback(
[Output('top-20-gdp-graph', 'figure'),
Output('least-50-gdp-graph', 'figure')],
[Input('service-type-dropdown', 'value'),
Input('year-slider', 'value')]
)
def update_graphs(selected_service_type, selected_year):
    # Filter data by year and service type
    filtered_data = cleaned_dataset[(cleaned_dataset['Year'] == selected_year) &
                                    (cleaned_dataset['Service Type'] == selected_service_type)]
    # Sort by GDP per capita
    sorted_data = filtered_data.sort_values(by='GDP per capita (current US$)', ascending=False)
    # Top 20 countries with the highest GDP per capita
    top_20_data = sorted_data.head(20)
    fig1 = px.scatter(
        top_20_data,
        x='Coverage',
        y='GDP per capita (current US$)',
        color='Country',
        size='GDP per capita (current US$)',
        hover_name='Country',
        title=f"Top 20 Countries by GDP per Capita for {selected_service_type} in {selected_year}",
        labels={'Coverage': 'Coverage (%)'}
    )
    fig1.update_layout(legend_title_text="Country", legend_orientation="v")
    # Bottom 50 countries with the lowest GDP per capita
    least_50_data = sorted_data.tail(50)
    fig2 = px.scatter(
        least_50_data,
        x='Coverage',
        y='GDP per capita (current US$)',
        color='Country',
        size='GDP per capita (current US$)',
        hover_name='Country',
        title=f"50 Countries with Lowest GDP per Capita for {selected_service_type} in {selected_year}",
        labels={'Coverage': 'Coverage (%)'}
    )
    fig2.update_layout(legend_title_text="Country", legend_orientation="v")
    return fig1, fig2
# Run the app
app4.run_server(mode='inline', port=8055)
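To put a number on the GDP vs. coverage relationship, the correlation approach used earlier for population could be repeated here. GDP per capita spans several orders of magnitude, so one option is to correlate coverage with the base-10 logarithm of GDP per capita as well as with the raw value. This is a suggestion rather than part of the original analysis. The sketch below uses made-up numbers, and the column name `log10_gdp` is hypothetical.

```python
import numpy as np
import pandas as pd

# Toy data (not the WASH dataset): coverage rises linearly with log10(GDP).
toy = pd.DataFrame({
    "Service Type": ["Water"] * 4,
    "GDP per capita (current US$)": [100.0, 1000.0, 10000.0, 100000.0],
    "Coverage": [20.0, 40.0, 60.0, 80.0],
})

# Log-transform GDP per capita to compress its wide range.
toy["log10_gdp"] = np.log10(toy["GDP per capita (current US$)"])

# Pearson correlation on the raw and on the log-transformed scale.
raw_corr = toy["GDP per capita (current US$)"].corr(toy["Coverage"])
log_corr = toy["log10_gdp"].corr(toy["Coverage"])

print(f"raw: {raw_corr:.2f}, log10: {log_corr:.2f}")
```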